home *** CD-ROM | disk | FTP | other *** search
- Path: uunet!rs
- From: rs@uunet.UU.NET (Rich Salz)
- Newsgroups: comp.sources.unix
- Subject: v10i069: Pascal to C translator, Part05/12
- Message-ID: <708@uunet.UU.NET>
- Date: 27 Jul 87 23:08:40 GMT
- Organization: UUNET Communications Services, Arlington, VA
- Lines: 1549
- Approved: rs@uunet.UU.NET
-
- Submitted-by: Per Bergsten <mcvax!enea!chalmers!holtec!perb>
- Posting-number: Volume 10, Issue 69
- Archive-name: ptoc/Part05
-
- #! /bin/sh
- # This is a shell archive. Remove anything before this line, then unpack
- # it by saving it into a file and typing "sh file". To overwrite existing
- # files, type "sh file -c". You can also feed this as standard input via
- # unshar, or by typing "sh <file", e.g.. If this archive is complete, you
- # will see the following message at the end:
- # "End of archive 5 (of 12)."
- # Contents: ptoc.doc
- PATH=/bin:/usr/bin:/usr/ucb ; export PATH
- if test -f 'ptoc.doc' -a "${1}" != "-c" ; then
- echo shar: Will not clobber existing file \"'ptoc.doc'\"
- else
- echo shar: Extracting \"'ptoc.doc'\" \(35411 characters\)
- sed "s/^X//" >'ptoc.doc' <<'END_OF_FILE'
- X.\" @(#)doc.ms 1.6 Date 87/06/29
- X.if t .rm CM
- X.ds LH Holistic Technology AB
- X.ds CH PTC
- X.ds RH HD870410-1 Rev: 1.6
- X.ds LF
- X.ds CF
- X.ds RF
- X.nr LL 6.5i
- X.nr PO 1i
- X.nr HM 0.75i
- X.nr FM 1.5i
- X.ie t .pl 842p
- X.el .pl 72v
- X.ch BT -\n(FMu/2u
- X.\" 3 tabs/inch at 12 cpi corresponds to 4 chars/tab
- X.nr t 1i/3u
- X.am DS
- X.ta \ntu +\ntu +\ntu +\ntu +\ntu +\ntu +\ntu +\ntu +\ntu +\ntu +\ntu +\ntu +\ntu +\ntu +\ntu +\ntu +\ntu +\ntu
- X..
- X.LP
- X.rs
- X.sp |8c
- X.nf
- X.ce 10
- X.if t .ps +6
- X.B "PTC implementation note"
- X.if t .ps -6
- X
- Xby
- X
- XPer Bergsten
- X
- XHolistic Technology AB
- XGrona Gatan 59
- X414 54 Gothenburg
- XSweden
- X.ce 0
- X.sp 12
- X.LP
- XThis note describes the implementation of ptc, a Pascal to C translator.
- XThe program was developed by Per Bergsten of Holistic Technology AB,
- XGothenburg, Sweden.
- XThe paper is intended to provide a guide for those who need to transport
- Xptc to a new environment,
- Xit describes how Pascal constructs are mapped onto C constructs.
- X.bp
- X.NH S 1
- XBackground
- X.LP
- X.ds CF - % -
- X.nr % 1
- XThe aim was originally to provide a simple tool for transporting finished
- Xapplications to systems lacking Pascal compilers.
- XThe first versions,
- Xconstructed in about 1984,
- Xwere capable of translating simple Pascal programs.
- XIt was never intended to become a released product,
- Xhowever,
- Xas time went by, programs and ambitions grew to the point where nearly
- Xthe full Pascal language was supported.
- XThus the program as it stands today has a long ancestry
- Xbut it has never been redesigned (which it should have been).
- X.NH 1
- XPascal vs C
- X.LP
- XTo anyone familiar with the two languages it is obvious that
- Xthey are very similar in structure.
- XThe major features may be summarised as follows:
- X.TS
- Xc c
- Xl l .
- XPascal C
- X.sp
- XBlock-structured Block-structured
- X- multiple declaration levels - single declaration level
- XStatically scoped Statically scoped
- XStrongly typed Weakly typed
- X - allows unbounded pointers
- XCall by value Mostly call by value
- X- and call by reference
- XHighly independent Highly integrated
- X- of host system - with system
- XSelf contained Allows external definitions.
- X.TE
- X.LP
- XOn the syntactic level the languages differ only in minor ways,
- Xmostly in the order in which keywords and other symbols occur,
- Xand of course in that the languages uses different symbols for the same
- Xpurposes.
- XThe only complication is that C prohibits nested subroutine declarations.
- X.LP
- XOn the semantic level the situation is worse.
- XC has the peculiarity that array variables are treated differently from other
- Xvariables,
- Xthis forces us to adopt some general way to handle array variables.
- XFurthermore, since Pascal offers nested subroutine declarations
- Xit becomes necessary to simulate part of the activation record mechanism
- Xin the translated code,
- Xin one case it is extremely difficult to completely do this.
- XIt is also clear that the C typedef mechanism has some shortcomings that
- Xpreclude an easy translation of Pascal types.
- X.bp
- X.NH 1
- XMapping Pascal to C
- X.LP
- XIn this part of the paper we briefly illustrate how to translate
- XPascal code into equivalent C code.
- X.NH 2
- XPrograms
- X.LP
- XA minimal Pascal program:
- X.DS
- Xprogram p;
- Xbegin
- Xend.
- X.DE
- Xtranslates to the C equivalent:
- X.DS
- Xextern void exit(\^);
- Xmain(\^)
- X{
- X exit(0);
- X}
- X.DE
- X.LP
- XIt should be noted here that
- Xthe translator does not actually require a complete Pascal program,
- Xthe implementation is such that any consistent set of declarations can
- Xbe translated.
- X.NH 2
- XLexical issues
- X.LP
- XThe C language uses ASCII as a carrier,
- Xalmost all of the availible characters are used.
- XThe Pascal definition uses a smaller set of characters.
- XSince few features of the languages depend on the underlying character set
- Xthis does not introduce much difficulties.
- X.LP
- XOne serious problem does occur.
- XBoth language definitions states that comments have no meaning and no
- Xclear place in the language syntax.
- XFurthermore, the Pascal definition states that a comment is equivalent
- Xto a blank character.
- XThis implies that if comments are handled accurately
- Xthe translator should also be able to collect and classify each
- Xblank character in a Pascal program and to generate a C program with the
- Xsame number of blank characters in the corresponding positions.
- XThis implication conflicts with the fact that the languages have different
- Xsyntax rules, it is not obvious what the "corresponding positions" would be.
- X.LP
- XSince comments have no defined meaning a user is free to interpret them
- Xin any way and, in particular, to associate them with the surrounding code
- Xin any way s/he chooses.
- XAlthough humans usually are able to deduce what bearing a comment has on the
- Xsurrounding program code there are no formal rules for how to do this.
- XTherefore there is no
- X.I "a priori"
- Xcorrect way to translate comments
- Xand the translator described here ignores comments altogether.
- XIf/when a reasonable
- X.I "ad hoc"
- Xsolution is found that feature may be incorporated.
- X.NH 2
- XDeclarations
- X.LP
- XThe program may introduce local declarations which are handled as follows.
- X.bp
- X.NH 3
- XLabels
- X.LP
- X.DS
- Xprogram p;
- X
- Xlabel 9;
- X
- Xbegin
- X9:
- X goto 9
- Xend.
- X.DE
- Xwhich we simply translate into:
- X.DS
- Xextern void exit(\^);
- Xmain(\^)
- X{
- XL9:
- X goto L9;
- X exit(0);
- X}
- X.DE
- X'ne 12
- X.LP
- XIf the label is reached from an inner procedure:
- X.DS
- Xprogram p;
- X
- Xlabel 100;
- X
- X procedure q;
- X
- X begin
- X goto 100
- X end;
- X
- Xbegin
- X100:
- Xend.
- X.DE
- Xa more complicated translation must be used:
- X.DS
- X# define Line __LINE__
- Xvoid Caseerror(\^);
- X# include <setjmp.h>
- Xstatic struct Jb { jmp_buf jb; } J[1];
- X
- X void
- Xq(\^)
- X{
- X longjmp(J[0].jb, 100);
- X}
- X
- Xmain(\^)
- X{
- X if (setjmp(J[0].jb))
- X goto L100;
- XL100:
- X exit(0);
- X}
- X.DE
- X.LP
- XWe assume the existence of the standard
- X.I setjmp(\^)
- Xand
- X.I longjmp(\^)
- Xlibrary functions.
- XJumpbuffers are statically allocated as needed depending on the number
- Xof declarationlevels in the program.
- X.NH 3
- XConstants
- X.LP
- XConstant declarations are treated in two different ways.
- XNumbers aliases etc are simply
- X.I "# define" 'd
- Xbut string constants are converted to static character arrays in order
- Xto avoid unnecessary duplication of string-constants in the object code,
- Xthus:
- X.DS
- Xconst
- X p = 3.14;
- X pie = '3.14';
- X.DE
- Xtranslates to:
- X.DS
- X# define pi 3.14
- Xstatic char pie[\^] = "3.14";
- X.DE
- X.NH 3
- XTypes and variables
- X.LP
- XTypes and variables are mostly easy to translate:
- X.DS
- Xprogram p;
- X
- Xconst length = 15;
- X
- Xtype
- X struct = 0 .. length;
- X vect = array [ struct ] of struct;
- X color = (red, blue, ada, yellow);
- X pointer = ^ recd;
- X recd = record
- X r : pointer;
- X case b : boolean of
- X false: (c : color);
- X true: (v : vect);
- X end;
- X
- Xvar SP : pointer;
- X
- Xbegin
- X new(SP, true);
- Xend.
- X.DE
- Xbecomes
- X.DS
- Xtypedef char boolean;
- X# define false (boolean)0
- X# define true (boolean)1
- Xextern char *Bools[\^];
- X# define Unionoffs(p, m) (((long)(&(p)->m))-((long)(p)))
- Xextern char *malloc(\^);
- X# define length 15
- Xtypedef unsigned char C47_struct;
- Xtypedef struct { C47_struct A[length + 1]; } vect;
- Xtypedef enum { red, blue, ada, yellow } color;
- Xtypedef struct S57 *pointer;
- Xtypedef struct S57 {
- X pointer r;
- X boolean b;
- X union {
- X struct {
- X color c;
- X } V1;
- X struct {
- X vect v;
- X } V2;
- X } U;
- X} recd;
- Xpointer sp;
- X
- Xmain(\^)
- X{
- X sp = (struct S57 *)malloc((unsigned)
- X (Unionoffs(sp, U.V2.v) + sizeof(sp->U.V2)));
- X exit(0);
- X}
- X.DE
- XThe rationale is as follows:
- X.LP
- XIdentifiers in the Pascal program which conflicts with reserved words in C are
- Xrenamed by adding a unique prefix Cnnn where nnn is a number.
- X.LP
- XWe also note here that uppercase letters in identifiers and keywords
- Xare translated to lowercase.
- XPascal specifies that upper/lower case is insignificant whereas C
- X(for the present) separates the two.
- XThis fact is used to advantage by the translator as all
- Xsubroutines and macros defined by the translator have an initial uppercase
- Xletter which prevents confusion.
- X.IP \-
- XThe type
- X.I boolean
- Xis a predefined Pascal type,
- Xwhen it is used the translator emits code which defines boolean to
- Xbe the smallest convenient type:
- X.I char .
- XThe constants
- X.I false
- Xand
- X.I true
- Xare defined and the vector
- X.I Bools
- Xwill contain textstrings for output if needed.
- X.IP \-
- XThe predefined types
- X.I integer
- Xand
- X.I real
- Xare likewise mapped directly onto the standard C types
- X.I int
- Xand
- X.I double
- Xthrough a typedef declaration.
- X.sp
- XInteger subranges are mapped onto standard C arithmetic types according to
- Xa short table in the translator.
- XThe table is scanned from top to bottom until an enclosing range is found
- Xand the corresponding type-name is emitted.
- X.IP \-
- XC-arrays have peculiar semantix.
- XTo unify the treatment of arrays and other datatypes we always encapsulate
- Xan array in a struct,
- Xthus an array always becomes a
- X.I struct
- Xwith a single member named A.
- X.IP \-
- XRecords and their variants are translated into C
- X.I struct
- Xand
- X.I union
- Xdefinitions.
- XSince C requires all union members to have a name we must supply artificial
- Xnames for all record variants.
- XA record with variants will therefore always contain one field with the name U
- Xwhich have sub-fields named Vnnn where nnn is a positive number.
- X.sp
- X'ne 12
- XWhen allocating dynamic storage for a record with variants naming
- Xthe desired variant
- X.DS
- Xnew(sp, true)
- X.DE
- Xwe face the problem of determining the amount of memory needed.
- X.QP
- X.B
- XC does not provide a safe way
- Xto compute the size of a particular struct variant.
- X.IP
- XThe strategy adopted to cope with this problem is to attempt to compute the
- Xoffset of a fieldmember in the variant matching the constant and then
- Xto add the size of the variant.
- XThe offset computation is expressed as a macro,
- X.I Unionoffs ,
- Xwhich uses rather foul typecasting to achive the result.
- XThe only availible alternative would be to use the same size of all variants.
- XThis method,
- Xwhile being safe,
- Xwastes memory when many small and few large
- Xrecords of the same type are created dynamically.
- X.IP \-
- XPascal enumeration types are converted directly to C
- X.I enum
- Xtypes.
- X.IP \-
- XPascal pointer types are translated into C pointer types.
- XPascal allows the programmer to construct recursive types as pointer
- Xtypes need not be fully defined until the end of the declaration-section
- Xwhere the pointer type is used.
- XIn practice this is only used to introduce record types which
- Xcontain pointers to themselves.
- XThis problem is partially solved by introducing a name for the record type.
- X.ne 10
- XHence
- X.DS
- Xtype
- X ptr = ^ node;
- X node = record
- X next : ptr;
- X info : integer
- X end;
- X.DE
- Xbecomes
- X.DS
- Xtypedef struct S56 * ptr;
- Xtypedef struct S56 {
- X ptr next;
- X integer info;
- X} node;
- X.DE
- Xwe note in passing that the problem cannot be completely solved since
- X.DS
- Xtype pureptr = ^ pureptr;
- X.DE
- Xwhich is valid Pascal, cannot be expressed in C.
- X.ne 10v
- X.IP \-
- XA pascal set-type does not have any direct counterpart in C.
- XThe C language does however have a adequate set of operators for
- Xbit manipulation.
- XWe use these to implement a Pascal set as an array of
- X.I setword .
- XSo:
- X.DS
- Xtype
- X s = set of 1 .. 100;
- X
- Xvar
- X ss : s;
- X.DE
- Xis translated into:
- X.DS
- Xtypedef unsigned short setword;
- Xtypedef struct { setword S[8]; } s;
- X
- Xs ss;
- X.DE
- XThe situation is slightly complicated by the fact that Pascal has
- Xa set constructor which permits the construction of arbitrary large sets,
- Xfor example:
- X.DS
- Xs := [ 50 .. maxint ] * [ 1 .. 80 ]
- X.DE
- Xfor that reason the first member in the array of words gives
- Xsize of the set (in words).
- XIn the example above s.S[0] would have the value 7,
- Xand s.S[1] through s.S[7] would hold the bits.
- XThe number 7 is computed on the assumption that the type
- X.I "unsigned short"
- Xon the target host is sufficiently large to holds 16 bits.
- XThe set operators of Pascal are implemented as C functions returning
- Xpointers to arrays of setwords,
- Xthe intermediary results are held in a static area of fixed size.
- X.IP \-
- XPascal files are implemented using the standard i/o package availible
- Xin most C implementations.
- XSince Pascal has the requirement that the next element of a file is visible
- Xin the filebuffer before it is read,
- Xand the requirement that linemarkers in textfiles are given special treatement
- Xwe are forced to extend the
- X.I FILE
- Xtype provided in
- X.I <stdio.h> .
- X.ne 20
- XHence:
- X.DS
- Xvar f : text;
- X.DE
- Xbecomes
- X.DS
- Xtypedef struct {
- X FILE *fp;
- X unsigned short
- X eoln:1,
- X eof:1,
- X init:1,
- X out:1,
- X :12;
- X char buf;
- X} text;
- Xtext f;
- X.DE
- Xwhere
- X.I buf
- Xis our filebuffer and
- X.I eoln ,
- X.I eof
- Xand
- X.I init
- Xare flags giving the state of the file.
- XAll Pascal i/o operations are implemented using macros that maintain the flags
- Xand the buffer variable.
- XThe actual reading and writing of data is deferred to the standard i/o package.
- X.NH 3
- XProcedures and functions
- X.LP
- XPascal procedures and functions are somewhat difficult to translate to C.
- XThe main problems lie in nested declarations and procedural parameters.
- XNested declarations are handled in the following manner:
- X.RS
- X.IP \-
- XIf procedure B is declared in procedure A,
- Xthen procedure B is emitted before A and A is forward declared before B.
- X.IP \-
- XAny constants and types declared in A is moved to the global scope,
- Xthis may force renaming of those objects.
- X.IP \-
- XAny variable declared in A
- X.I
- Xand used in B
- X.R
- Xis converted to a pointer and moved to the global scope.
- XThe pointer value is saved and re-initialized upon each entry of A
- Xand restored upon exit from A.
- X.RE
- X.br
- X'ne 20
- X.LP
- XHence:
- X.DS
- Xprocedure A;
- X
- Xconst limit = 20;
- X
- Xtype loctyp = 0 .. limit;
- X
- Xvar i, j : loctyp;
- X
- X procedure B(k : loctyp);
- X
- X begin
- X j := k + 2
- X end;
- X
- Xbegin
- X B(i)
- Xend;
- X.DE
- Xbecomes
- X.DS
- Xtypedef unsigned char loctyp;
- Xloctyp *G56_j;
- X
- Xvoid a(\^);
- X
- X void
- Xb(k)
- X loctyp k;
- X{
- X (*G56_j) = k + 2;
- X}
- X
- X void
- Xa(\^)
- X{
- X# define limit 20
- X loctyp i, j;
- X loctyp *F57;
- X
- X F57 = G56_j;
- X G56_j = &j;
- X b(i);
- X G56_j = F57;
- X}
- X.DE
- Xwe see that references to
- X.I j
- Xinside procedure
- X.I b
- Xare replaced by the pointer
- X.I G56_j
- Xwhich points to the local variable
- X.I j
- Xin procedure
- X.I a.
- XIn order to preserve the proper semantix in the face of recursion
- Xthe value of the pointer need also be saved in the local variable
- X.I F57
- Xduring the invocation of
- X.I a .
- X.IP \-
- XProcedure parameters offer little problems.
- XWe translate Pascal value-parameters into ordinary C parameters
- Xand Pascal var-parameters are treated as pointers.
- X.br
- X'ne 20
- X.IP \-
- XProcedural parameters appear at first to be easy to handle.
- XThe ordinary program:
- X.DS
- Xprogram p;
- X
- Xprocedure pp(procedure q(i : integer));
- X
- Xbegin
- X q(2)
- Xend;
- X
- Xprocedure qq(i : integer);
- Xbegin
- Xend;
- X
- Xbegin
- X pp(qq)
- Xend.
- X.DE
- Xbecomes
- X.DS
- Xextern void exit(\^);
- Xtypedef int integer;
- X
- X void
- Xpp(q)
- X void (*q)(\^);
- X{
- X (*q)(2);
- X}
- X
- X void
- Xqq(i)
- X integer i;
- X{
- X}
- X
- Xmain(\^)
- X{
- X pp(qq);
- X exit(0);
- X}
- X.DE
- Xwhich looks simple enough.
- X.br
- XHowever,
- XPascal requires that the scope of a procedure
- X.I "remains unchanged"
- Xthroughout its lifetime.
- X.ne 35
- XConsider:
- X.DS
- Xprogram demo(output);
- X
- Xvar i : integer;
- X
- X procedure p(procedure q);
- X
- X var j : integer;
- X
- X procedure qq;
- X
- X begin
- X writeln(j)
- X end;
- X
- X begin
- X j := i;
- X q;
- X if i < 1 then
- X begin
- X i := i + 1;
- X p(qq)
- X end
- X end;
- X
- X procedure dummy;
- X begin
- X end;
- X
- Xbegin
- X i := 0;
- X p(dummy)
- Xend.
- X.DE
- XWhen
- X.I p
- Xis first invoked it assigns the local variable
- X.I j
- Xthe value 0.
- XThis variable is accessible from
- X.I qq
- Xwhich is passed as a parameter in
- Xthe recursive call of
- X.I p .
- XThe second invocation of
- X.I p
- Xthen sets
- X.I its
- Xvariable
- X.I j
- Xto 1,
- Xand calls
- X.I q
- Xwhich is bound to the first instance of
- X.I qq ,
- Xand should therefore print the number
- X.I 0 .
- X.B
- XSadly,
- Xthe code produced by the translator fails to do this.
- X.R
- XIt is obvious that the program above calls for a complete simulation
- Xof the activation record mechanism of Pascal to work correctly.
- X.RS
- X.LP
- XA workable but unpractical solution would be:
- X.IP 1)
- XWhen qq is used as parameter we call a function q1 which saves the environment
- Xfor qq (i.e. the address of j) in a well known place
- Xand returns a pointer to q2.
- X.IP 2)
- XWhen qq is later called (under the name q) the actual target will be q2
- Xwhich sets up the proper environment calls qq.
- X.RE
- X.IP
- XThe problem is that this requires a save-area for
- X.I each
- Xprocedural parameter which can hold the intresting
- Xparts of its environment for
- X.I each
- Xof its invocations.
- XIn the example above we need one are which holds the address of i
- Xfor each instance of qq
- X(say N instances, where N depends on the run-time behaviour of p).
- XIt also requires a set of different procedures to perform the work of q2
- X(N different procedures which sets up the environment for the N instances).
- X.B
- XThis requires much to much work to implement so the problem is left unsolved,
- X.R
- Xthis is hardly a problem in practice since humans rarely write such code
- Xbut
- X.B
- Xit could introduce problems
- X.R
- Xin machine generated Pascal code.
- X.IP
- XIt should be noted that
- Xthe translator accepts the keyword
- X.B external
- Xin place of the Pascal
- X.B forward
- Xdirective and assumes that the so declared procedure is defined elsewhere.
- X.bp
- X.NH 2
- XStatements.
- X.LP
- XPascal statements are comparatively easy to translate to C.
- XThe only parts that require special care are
- Xnon-local
- X.I goto ,
- X.I with
- Xand
- X.I for
- Xstatements.
- XThe code
- X.DS
- Xprogram p(output);
- X
- Xtype
- X msgtyp = packed array [ 1 .. 12 ] of char;
- X
- Xvar
- X a : array [ 1 .. 10 ] of
- X record
- X r : real
- X end;
- X i : integer;
- X ok : boolean;
- X
- X procedure fatal(m : msgtyp);
- X
- X begin
- X writeln('Fatal error: ', m)
- X end;
- X
- Xbegin
- X while true do
- X repeat
- X ok := false;
- X i := 1;
- X for i := i + 1 to i + 10 do
- X if i > 10 then
- X fatal(' 10 exceeded')
- X else
- X with a[i] do
- X if r > 9.99e+38 then
- X ok := true
- X else
- X writeln(r)
- X until ok
- Xend.
- X.DE
- X'ne 20
- Xbecomes
- X.DS
- Xtypedef char boolean;
- X# define false (boolean)0
- X# define true (boolean)1
- Xtypedef int integer;
- Xtypedef double real;
- X
- Xtypedef struct { char A[12 - 1 + 1]; } msgtyp;
- Xtypedef struct { struct S57 {
- X real r;
- X} A[10 - 1 + 1]; } T56;
- XT56 a;
- Xinteger i;
- Xboolean done;
- X
- X void
- Xfatal(m)
- X msgtyp m;
- X{
- X (void)fprintf(output.fp, "Fatal error: %.12s", m.A),
- X Putchr('\en', output);
- X}
- X
- Xmain(\^)
- X{
- X while (true)
- X do {
- X done = false;
- X i = 1;
- X {
- X integer B1 = i + 1, B2 = i + 10;
- X
- X if (B1 <= B2)
- X for (i = B1; ; i++) {
- X if (i > 10)
- X fatal(*((msgtyp *)" 10 exceeded"));
- X else {
- X register struct S57 *W3 = &a.A[i - 1];
- X
- X if (W3->r > 9.99e+38)
- X done = true;
- X else
- X (void)fprintf(output.fp, "% 20e", W3->r),
- X Putchr('\en', output);
- X }
- X if (i == B2) break;
- X }
- X }
- X } while (!(done));
- X exit(0);
- X}
- X.DE
- Xfor the following reasons:
- X.NH 3
- XBegin
- X.LP
- XThe compound statements
- Xare very similar in the two languages and need no further explanation.
- X.NH 3
- XIf
- X.LP
- XPascal if-statements
- Xhave the same structure and semantix as C if-statments.
- X.NH 3
- XCase
- X.LP
- XPascal case-statements
- Xhave the same structure and semantix as C switch-statements provided that a
- X.I break
- Xis always added to each entry.
- X.LP
- XThe translator supports a common Pascal extension
- Xin that it recognizes the keyword
- X.B otherwise
- Xto signify a default entry in a case-statement.
- X.NH 3
- XLabels
- X.LP
- XPascal labeled statements and labels have the same structure and semantix as
- XC labeled statements except that Pascal labels are numbers where C labels
- Xare identifiers,
- Xthis difference is solved by simply prefixing the labels with the letter
- X.I L .
- X.NH 3
- XGoto
- X.LP
- XPascal goto-statements
- Xhave the same structure as C goto statements but the semantix differ in
- Xthat Pascal allows a goto to lead out of the current procedure.
- XThis is implemented using the
- X.I setjmp/longjmp
- Xlibrary functions of C as described earlier.
- X.NH 3
- XWith
- X.LP
- XThe with-statement
- Xof Pascal has no counterpart in C.
- XIt is translated into nested compund statements where pointervariables,
- Xreferencing the corresponding records,
- Xare declared and initialized.
- XWithin the scope of the with-statement,
- Xthe accessible record fields are renamed to include the pointer.
- XThe effect of this is that the record address is evaluated once as the
- XPascal standard requires.
- X.NH 3
- XFor
- X.LP
- XThe for-statement
- Xof Pascal has a structure that is similar to the C for-statement
- Xbut the semantix differ completely.
- XPascal requires that a loop be exited when the upper bound
- Xhas been reached,
- XPascal also requires that the loop-boundaries be evaluated exactly once.
- XThe standard C for-loop is exited when the loop-condition becomes false.
- XThis implies that it is not always true that
- X.DS
- Xfor (i = 0; i <= e; i++) ;
- X.DE
- Xbehaves in the same manner as
- X.DS
- Xfor i := 0 to e do ;
- X.DE
- Xsince (in most implementations)
- Xthe C version becomes an infinite loop if
- X.I e
- Xequals
- X.I maxint
- Xor if
- X.I e
- Xis the expression
- X.I i .
- XFor that reason Pascal for-statments are translated into compound statements
- Xwhere the upper/lower bounds of the for-loop are held in local variables.
- XIt is also necessary to add a conditional break-statement at
- Xthe end of the loop.
- XIt is possible to obtain the more relaxed translation by configuring the
- Xtranslator appropriately (see "Tuning" below).
- X.NH 3
- XWhile
- X.LP
- XThe while-statement behaves exactly the same in both languages.
- X.NH 3
- XRepeat
- X.LP
- XThe repeat-statement of Pascal matches the do-while statement of C except
- Xfor the trivial difference that Pascal permits a statement-list
- Xwhere C permits a single statment in the loop.
- X.NH 3
- XEmpty
- X.LP
- XThe empty statement has (of course) the same structure and semantix
- Xin both languages.
- X.NH 2
- XExpressions and assignments
- X.LP
- XIn most cases Pascal expressions can be literally translated into equivalent
- XC expressions.
- X.IP identifiers 15
- XExcept where identifiers clash with reserved words or with other
- Xidentifiers declared in the same scope,
- Xthey may simply be printed.
- XIn some cases the translator is forced to rename identifiers or to
- Xinvent new identifiers.
- X.IP operators
- XThe operators +, -, *
- X.I div
- Xand
- X.I mod
- Xwhen applied to real or integer operands
- Xhave exact counterparts in C and are therefore easy to handle.
- XThe operator for real division, /,
- Xcorresponds to the same C operator except that the operands may be integers.
- XIn that case a cast is necessary.
- XWhen the operands are sets the expression is converted into a function call.
- X.sp
- XThe operators <, <=, >, >=, = and <> all have exact counterparts in C
- Xfor integer and real operands.
- XMost C implementations disallows
- X.I enum
- Xoperands,
- Xthe translator therefore casts such operands to
- X.I int .
- XComparisons on structures (i.e. strings or sets)
- Xare converted into function calls.
- X.IP assignments
- XAssignments are straightforward to handle since arrays are encapsulated
- Xin structures.
- XTherefore:
- X.DS
- Xa := b
- X.DE
- Xbecomes
- X.DS
- Xa = b
- X.DE
- X.I unless
- Xb is a string or a set,
- Xin which case the assignment is converted into a function call.
- X.IP indexing
- XGiven the translation for array declarations (described above) we are forced
- Xto translate
- X.DS
- Xa[i]
- X.DE
- Xinto
- X.DS
- Xa.A[i - c]
- X.DE
- Xwhere
- X.I c
- Xis the lower bound for the index type.
- X.IP selection
- XGiven the translation for records with variants (described above) the
- Xtranslation of
- X.DS
- Xa.b
- X.DE
- Xbecomes
- X.DS
- Xa.b
- X.DE
- X.I or ,
- Xif b is declared in a variant,
- X.DS
- Xa.Vxxx.b
- X.DE
- Xwhere Vxxx is a name manufactured by the translator for the particular variant.
- X.IP dereferencing
- XPointer references and
- X.B var -parameters
- Xare handled by prefixing the expression with an asterisk,
- Xbut the special case dereferencing followed by selection is also recognized,
- Xso:
- X.DS
- Xp^ := q^.f
- X.DE
- Xbecomes
- X.DS
- X*p = q->f
- X.DE
- X.IP miscellanea
- XThe boolean operators
- X.B and ,
- X.B or
- Xand
- X.B not
- Xare simply translated into their C counterparts.
- XThe set contructors
- X.B "[\^]" ,
- Xand
- X.B ".."
- Xand the operator
- X.B in
- Xare converted to function calls.
- X.bp
- X.NH 1
- XImplementation
- X.LP
- XThe general strategy is to convert the Pascal source program
- Xinto a parsetree.
- XThe tree is then traversed by a set of procedures that perform some
- Xnecessary transformations.
- XThe tree is finally traversed by a set of procedures that print the
- Xcorresponding C constructs.
- XThe translator consists of three major procedures that perform
- Xthese actions.
- XThey are augmented by a set of procedures that maintain a symbol table
- Xthat holds information about identifiers found in the source,
- Xand by a procedure that initializes all internal datastructures.
- X.LP
- XThere are three major datastructures that interact in complicated ways:
- X.IP 1)
- Xa store for identifiers and strings
- X.IP 2)
- Xa multilevel symbol table
- X.IP 3)
- Xa parse-tree.
- X.LP
- XThe identifier and string store,
- X.B strstor
- Xis in principle an array of characters that grow
- Xin increments of one string block.
- XExactly one copy of each identifier is stored in that array.
- XWhenever an identifier is found in the input stream the array is
- Xscanned to see if that identifier has been seen before.
- XIn order to speed up the search all identifiers are represented by nodes
- Xwhich are chained together such that all nodes in a particular chain
- Xhave the same hashvalue as determined by the function
- X.B hash .
- XEach
- X.B idnode
- Xholds an index to strstor where the corresponding identifier text is stored.
- XThe start of the hashchains are found in the global variable
- X.B idtab .
- X.de Ds
- X.nr x \\w'\\$2'u
- X.ie t \{
- X.nr x \\nx/2
- X.ds \\$1 \\\\h'-\\nxu'\\$2\\\\h'-\\nxu'
- X.\}
- X.el .ds \\$1 \\$2\\\\h'-\\nxu'
- X..
- X.ds l \-
- X.ie t .ds a \z\*l\a
- X.el .ds a \a\z\*l
- X.nr x \ntu/2
- X.ds g \z\*l\\l'\nxu\&\*l'
- X.ds h \\h'\nxu'
- X.Ds c +
- X.Ds d \(da
- X.Ds u \(ua
- X.Ds r \(br
- X.ds s \\*r\z\*l
- X.ds > \*l\z\*l\(->
- X.ds < \(<-\\h'-1n'\*l
- X.ds k \\kx
- X.ds b \\h'|\\nxu'
- X.ds t \t
- X.DS L
- X.lc \*l
- X.fc #
- Xidtab
- X\*c\*a\*c
- X\*r\*t\*r\*tchain of idnodes with same hashvalue
- X\*c\*a\*c\*t\*c\*a\*a\*c\*t\*c\*a\*a\*c
- X\*k\*r\*t\*r\*b\*h\*a#\*> #\*r\*t\*k\*t\*r\*b\*h\*a#\*> #\*r\*t\*t\*r idnode representing
- X\*c\*a\*c\*t\*r\*t\*t\*r\*t\*k index=2\*b\*r\*t\*t\*r identifier "demo"
- X\*r\*t\*r\*t\*c\*a\*a\*c\*t\*c\*a\*a\*c
- X
- X\*tstrstor
- X\*c\*a\*a\*a\ \*l \*a\*a\*c
- X\*r\*t\*r\*t\*r\*t\*t\*r\*t\*r
- X\*k\*c\*a\*a\*a\ \*l \*a\*a\*c\*b\*h\*r
- X\*h\*r
- X\*h\*d
- X\*c\*a\*a\*a\*a\*a\*a\*a\*a
- X\*r\*t\*r\*t\*r# d #\*r# e #\*r# m #\*r# o #\*r# / #\*r
- X\*c\*a\*a\*a\*a\*a\*a\*a\*a\*tfirst idblock
- X\*t\*t# \*u #
- X\*t\*tindex=2
- X.DE
- X.LP
- XSo:
- Xthe global representation of the identifier "demo" is a particlular
- Xidnode;
- Xwhenever the lexical analysis routine
- Xrecognizes the identifier "demo" it returns a pointer to that idnode.
- X.LP
- XIn order to represent different identifiers with the same name we need to
- Xbe able to distinguish between identifiers declared in different scopes.
- XThis is accomplished by the
- X.B symnode
- Xstructures.
- XWhen an identifier is first encountered in a given scope it is "declared",
- Xmeaning that a new symnode is created that references the identifier.
- XOccurrences of the same identifier in that scope are then represented in
- Xthe parse-tree by treenodes referencing the same symnode.
- X.bp
- XThe program:
- X.DS
- Xprogram p;
- X
- Xvar demo : integer;
- X
- Xbegin
- X demo := 3
- Xend.
- X.DE
- Xyilds the following structure:
- X.DS L
- X.lc \*l
- X.fc #
- X\*t# top #
- X\*t# \*d #
- X\*t\*c\*a\*a\*a\*a\*c treenode representing
- X\*t\*k npgm\*b\*r\*t\*t\*t\*t\*r the program
- X\*t\*c\*a\*s\*a\*s\*a\*s\*a\*c
- X\*t\*t\*r\*t\*r\*h\*u\*t\*r\*h\*u
- X\*t\*t\*r\*t\*r\*h\*r\*t\*r\*h\*c\*a\*a\*a\*a\*a\*a\*a\*g\*c
- X\*t\*t\*r\*t\*r\*h\*r\*t\*c\*a\*a\*a\*a\*a\*a\*a\*c\*h\*r
- X\*t\*t\*r\*t\*d\*h\*r\*t\*t\*t\*t\*t\*t\*t\*t\*r\*h\*r
- X\*t\*t\*r\*h\*c\*a\*g\*s\*a\*a\*c\*k treenode representing\*b\*t\*t\*t\*t\*t\*t\*r\*h\*r
- X\*t\*t\*r\*h\*k nvar\*b\*r\*t\*t\*t\*\*kr the var-declaration\*b\*t\*t\*t\*t\*t\*t\*d\*h\*r
- X\*t\*t\*r\*h\*c\*g\*s\*a\*s\*a\*c\*t\*t\*t\*c\*a\*a\*a\*g\*s\*a\*c treenode repr.
- X\*t\*t\*r\*t\*r\*h\*u\*t\*r\*t\*t\*t\*t\*k nassign\*b\*r\*t\*t\*t\*t\*r assignment
- X\*t\*t\*r\*t\*r\*h\*r\*t\*c\*a\*k\*> to type\*b\*t\*t\*t\*c\*a\*s\*a\*a\*s\*a\*c
- X\*ksymtab\*b\*t\*t\*r\*t\*d\*h\*r\*t\*t\*t\*t\*t\*t\*d\*h\*u\*t\*t\*d\*h\*u
- X\*c\*a\*c\*t\*r\*h\*c\*a\*g\*s\*a\*c\*k treenode repr.\*b\*t\*t\*t\*t\*c\*a\*g\*s\*a\*c\*h\*c\*a\*g\*s\*a\*a\*c
- X\*r\*t\*r \*<\*a\*c\*h\*k nid\*b\*r\*t\*t\*r\*k occurrence of\*b\*t\*t\*t\*t\*k nid\*b\*r\*t\*t\*r\*h\*k ninteger\*b\*r\*t\*t\*t\*r
- X\*t\*t\*h\*c\*a\*s\*a\*c\*k id. "demo"\*b\*t\*t\*t\*t\*c\*a\*s\*a\*c\*h\*c\*a\*a\*s\*a\*c
- X\*r\*t\*r\*t\*t\*r\*h\*u\*t\*t\*t\*t\*t\*t\*r\*t\*t\*t\*r\*h\*u
- X\*c\*a\*c\*t\*t\*r\*h\*r\*t\*c\*a\*a\*a\*a\*a\*c\*t\*t\*t\*r\*h\*r
- X\*r\*t\*r\*t\*t\*d\*h\*r\*t\*d\*t\*t\*t\*t\*t\*t\*t\*t\*d\*h\*r
- X\*c\*a\*c\*t\*c\*a\*g\*s\*a\*a\*c\*k symnode representing\*b\*t\*t\*t\*t\*t\*t\*c\*a\*g\*s\*a\*g\*c
- X\*r\*k\*t\*r\*b\*h\*a\*>\*t\*k lidentifier\*b\*r\*t\*t\*t\*r\*k identifier "demo"\*b\*t\*t\*t\*t\*t\*t\*k linteger\*b\*r\*t\*t\*h\*r
- X\*c\*a\*c\*t\*c\*a\*s\*a\*a\*c\*k in the current scope\*b\*t\*t\*t\*t\*t\*t\*k linum=3\*b\*r\*t\*t\*h\*r
- X\*t\*t\*t\*r\*t\*t\*t\*t\*t\*t\*t\*t\*c\*a\*a\*g\*c
- X\*kidtab\*b\*t\*t\*t\*c\*a\*a\*a\*c
- X\*c\*a\*c\*t\*t\*t\*t\*t\*r
- X\*r\*t\*r\*t\*t\*t\*t\*t\*d
- X\*c\*a\*c\*t\*c\*a\*a\*c\*t\*c\*a\*a\*c
- X\*k\*r\*t\*r\*b\*h\*a#\*> #\*r\*t\*k\*t\*r\*b\*h\*a#\*> #\*r\*t\*t\*r idnode representing
- X\*c\*a\*c\*t\*r\*t\*t\*r\*t\*k index=2\*b\*r\*t\*t\*r identifier "demo"
- X\*r\*t\*r\*t\*c\*a\*a\*c\*t\*c\*a\*a\*c
- X
- X\*tstrstor
- X\*c\*a\*a\*a\ \*l \*a\*a\*c
- X\*r\*t\*r\*t\*r\*t\*t\*r\*t\*r
- X\*c\*g\*s\*a\*a\*a\ \*l \*a\*a\*c
- X\*h\*r
- X\*h\*d
- X\*c\*a\*a\*a\*a\*a\*a\*a\*a
- X\*r\*t\*r\*t\*r# d #\*r# e #\*r# m #\*r# o #\*r# / #\*r
- X\*c\*a\*a\*a\*a\*a\*a\*a\*a\*tfirst idblock
- X\*t\*t# \*u #
- X\*t\*tindex=2
- X.fc
- X.lc
- X.DE
- XWe see that the two occurrences of the identifier "demo" are represented by
- Xtwo
- X.B treenodes
- Xof variant
- X.B nid
- Xthat both have pointers to the same
- X.B symnode
- Xrepresenting the declaration of "demo".
- XAll symnodes at a given declarationlevel are chained together (in the
- Xsame manner as the idnodes) and the chains are attached to the global
- Xvariable
- X.B symtab .
- XIn order to find out if an identifer is declared in the current scope we
- Xscan the chain of symnodes starting from symtab, and check if any of them
- Xreferences the idnode we are looking for.
- XA symnode also have a pointer to the treenode which defines the symbol.
- XThe
- X.B symtabs
- Xthemselves are also chained,
- Xthe chain implements a stack of declarationlevels.
- XThe symtab which is created when the scope of a procedure is entered
- Xis also attached to that procedure.
- XWhen a procedure is entered we create a new symtab, push it onto the stack,
- Xparse the procedure and pop the stack.
- XThe symbols then visible are those that still can be reached from the stack.
- X.LP
- XThe parse-tree consists of
- X.B treenodes
- Xrepresenting each declaration, statement, expression etc.
- XEach node except the nprogram node
- Xhas a pointer to its immediate father (giving the defining point),
- Xto its children (if it has any) and to its right brother (if it is
- Xa member of a list).
- XThe top of the tree is found in the global variable
- X.B top .
- X.LP
- XIn order to find the defining point for the identifier in the assignment,
- Xwe follow pointers from the nassign-treenode
- Xto the nid-treenode, to the lidentifier-symnode,
- Xand then up to the nid-treenode in the declaration.
- XIf we want to know the type of the identifier
- Xwe proceed up to the nvar-treenode
- Xand then down to the node giving the type in the declaration
- X(in our example that node would also be a nid-treenode referencing a
- Xlinteger-symnode that represents the identifier "integer").
- XThere is a function
- X.B typeof
- Xthat performs exactly this operation.
- XIt will return a pointer to a npredef-, nsubrange-, nscalar-, nptr-
- Xnarray-, nrecord-, nfileof- or nsetof-treenode.
- XIn those cases where the translator pays attention to types
- Xit uses a pointer (obtained from typeof) as representation of a type.
- X.LP
- XGiven the parse-tree and the layered symbol table
- Xit is not hard to see how the translations described earlier are performed.
- XThe one place where the code becomes more than usually complex is when a
- X.B write
- Xstatement with format specifications is to be translated into a call to
- X.B fprintf .
- X.bp
- X.NH 1
- XTuning
- X.LP
- XThe behaviour of the translator may be modified for some cases simply
- Xby changing constants.
- XThese constants are all located at the top of the program text.
- X.IP output 12
- XThe translator will copy the source during input if the constant
- X.B echo
- Xis true.
- XThe copy is preceeded by the line
- X.DS
- X# ifdef PASCAL
- X.DE
- Xand ended by the line
- X.DS
- X# else
- X.DE
- Xand then follows the translated code
- Xand finally
- X.DS
- X# endif
- X.DE
- XThis may be used to hold both Pascal and C source in the same file.
- XIf the Pascal part is changed the C part may be updated through:
- X.DS
- X cpp -P -DPASCAL < orig > tmp
- X ptc < tmp > new
- X move new orig
- X.DE
- Xassuming that
- X.B echo
- Xis true and that
- X.B cpp
- Xis the standard C preprocessor.
- X.DS
- XDefault value:
- X
- X echo = false;
- X.DE
- X.IP comments
- XThe translator recognizes both (* and { as comment delimiters.
- XThey are treated as different,
- Xallowing 1 level nested comments,
- Xif the constant
- X.B diffcom
- Xis true.
- X.DS
- XDefault value:
- X
- X diffcomm = false;
- X.DE
- X.IP symbols
- XThe translator accepts default entries in case-statements provided
- Xthat the keyword defined through
- X.B othersym
- Xis used in place of the constant list.
- X.DS
- XDefault value:
- X
- X othersym = 'otherwise ';
- X.DE
- Xsubstitute for
- X.DS
- X othersym = 'otherwise%';
- X.DE
- Xif that feature is undesired.
- X.IP
- XThe translator accepts externally declared procedures and functions
- Xprovided that the directive defined through
- X.B externsym
- Xis used in place of the keyword
- X.B forward .
- X.DS
- XDefault value:
- X
- X externsym = 'external ';
- X.DE
- X.IP sets
- XSets are implemented as arrays of
- X.B wordtype .
- XThe type is assumed to hold
- X.B "setbits + 1"
- Xbits numbered from 0.
- X.DS
- XDefault value:
- X
- X wordtype = 'unsigned short';
- X setbits = 15;
- X.DE
- X.ne 10
- X.IP files
- XThe implementation of files uses a flag-word which has the type given as
- X.B filebits
- Xwhich is assumed to hold
- X.B "filefill + 4"
- Xbits.
- X.DS
- XDefault value:
- X
- X filebits = 'unsigned short';
- X filefill = 12;
- X.DE
- X.ne 20
- X.IP stmts
- XIf the Pascal source is known to be "normal" in the sense that for-loops
- Xalways have an upper bound that is less than the maximum value of the
- Xtype of the loop-variable, and in the sense that the upper bound doesn't
- Xchange by computations inside the loop, then the translator may generate
- Xa more natural code.
- XI.e:
- X.DS
- Xfor i := 1 to 10 do ;
- X.DE
- Xbecomes
- X.DS
- Xfor (i = 1; i <= 10; i++) ;
- X.DE
- XSince the requirements cannot be determined by the translator itself
- Xthis kind of code is generated when the constant
- X.B lazyfor
- Xis true.
- X.DS
- XDefault value:
- X
- X lazyfor = false;
- X.DE
- X.ne 10
- X.IP new
- XThe second and following parameters to the procedure
- X.B new
- Xwill be ignored if the constant
- X.B unionnew
- Xis false.
- X.DS
- XDefault value:
- X
- X unionnew = true;
- X.DE
- X.ne 10
- X.IP strings
- XAll identifiers and strings are stored in the translator with the special
- Xcharacter having the ordinal value
- X.B null
- Xas endmarker.
- XHence, that character can not occur in strings in the Pascal source.
- X.DS
- XDefault value:
- X
- X null = 0;
- X.DE
- X.ne 20
- X.IP types
- XThe names of predefined types are given by the constants:
- X.B inttyp ,
- X.B chartyp ,
- X.B floattyp ,
- Xand
- X.B doubletyp .
- X.DS
- XDefault value:
- X
- X inttyp = 'int';
- X chartyp = 'char';
- X floattyp = 'float';
- X doubletyp = 'double';
- X.DE
- XThe typename for real variables and functions defined by the user
- Xis given by the constant
- X.B realtyp .
- X.DS
- XDefault value:
- X
- X realtyp = doubletyp;
- X.DE
- XThe typename for procedures is given by the constant
- X.B voidtyp .
- X.DS
- XDefault value:
- X
- X voidtyp = 'void';
- X.DE
- X.ne 10
- X.IP i/o
- XThe default fieldwidths for integer and real values written on textfiles
- Xare given by the constants
- X.B intlen
- Xand
- X.B fixlen .
- X.DS
- XDefault value:
- X
- X intlen = 10;
- X fixlen = 20;
- X.DE
- X.IP types
- XA table in the translator gives the mapping from Pascal
- Xinteger subrange types to C arithmetic types.
- XThe table is initialized by code located at the end of procedure
- X.I initialize
- Xgiving the following default configuration:
- X.TS
- Xl l l
- Xr r l .
- XLow bound High bound Selected type
- X.sp
- X0 255 unsigned char
- X-128 127 char
- X0 65535 unsigned short
- X-32768 32767 short
- X-2147483647 2147483647 long
- X.TE
- END_OF_FILE
- if test 35411 -ne `wc -c <'ptoc.doc'`; then
- echo shar: \"'ptoc.doc'\" unpacked with wrong size!
- fi
- # end of 'ptoc.doc'
- fi
- echo shar: End of archive 5 \(of 12\).
- cp /dev/null ark5isdone
- MISSING=""
- for I in 1 2 3 4 5 6 7 8 9 10 11 12 ; do
- if test ! -f ark${I}isdone ; then
- MISSING="${MISSING} ${I}"
- fi
- done
- if test "${MISSING}" = "" ; then
- echo You have unpacked all 12 archives.
- rm -f ark[1-9]isdone ark[1-9][0-9]isdone
- else
- echo You still need to unpack the following archives:
- echo " " ${MISSING}
- fi
- ## End of shell archive.
- exit 0
- --
-
- Rich $alz "Anger is an energy"
- Cronus Project, BBN Labs rsalz@bbn.com
- Moderator, comp.sources.unix sources@uunet.uu.net
-